SERUM BIOMARKERS IN LUNG CANCER

BACKGROUND OF THE INVENTION 

[0001] The present invention relates generally to the field of serum biomarkers in
lung carcinoma. More particularly, the invention relates to serum biomarkers that can
distinguish lung cancer from normal.

[0002] Lung cancer is the leading cause of cancer death worldwide, resulting in 
150,000 deaths per year in the United States. The mortality rate from lung cancer is 
greater than the combined mortality from breast, prostate and colorectal cancers. On 
the basis of morphology, lung cancer can be broadly classified into four main
categories, namely adenocarcinoma, squamous cell carcinoma, large cell
undifferentiated carcinoma and small cell carcinoma. In Hong Kong from 1990 to
1996, the proportions of adenocarcinoma, squamous cell carcinoma, large cell
undifferentiated carcinoma and small cell carcinoma were 45.5%, 27.5%, 4.7% and
10.3%, respectively. Both squamous cell carcinoma and small cell carcinoma are
strongly associated with a smoking history. 

[0003] Adenocarcinoma, squamous cell carcinoma, and large cell undifferentiated 
carcinoma are usually referred to as "non-small cell carcinoma." They are relatively
chemo-resistant, and hence the mainstay of treatment is surgery. By contrast, small
cell carcinoma has a higher propensity for distant metastases and is mainly treated by 
chemotherapy. 

[0004] Biopsy can be used to diagnose lung cancer, but it is an invasive procedure 
and, therefore, less than desirable. Other diagnostic methods for lung cancer include 
ultrasound and computed tomography (CT) scan. 

[0005] It would be highly desirable to have a biomarker or combination of 
biomarkers capable of distinguishing between lung cancer and normal cells. In 
addition, a simple test could aid in tracking treatment progress and even identify 
molecular targets for therapy. The literature on lung cancer diagnosis has not 
disclosed heretofore such a biomarker or combination of biomarkers, however. 

SUMMARY OF THE INVENTION

[0006] In accordance with the present invention, biomarkers and combinations of 
biomarkers are used to identify lung cancer. The method successfully distinguishes 
between lung cancer and normal states, and can be used to identify the particular type 
of lung cancer. In one embodiment, a method for qualifying lung carcinoma status in 
a subject (e.g., a patient) comprises analyzing a biological sample from the subject for 
one or more of the top 50 biomarkers as shown in Figure 2 or Figures 4A and 4B. 
Thus, to assess overall lung cancer risk versus normal, a biomarker is selected from 
the group consisting of 

(A) IM-522, IM-273, IM-520, IM-519, IM-454, IM-507, IM-521, IM-148,
IM-266, IM-537, IM-471, IM-510, IM-544, IM-474, IM-155, IM-157,
IM-176, IM-445, IM-177, IM-440, IM-468, IM-438, IM-547, IM-359,
IM-436, IM-106, IM-455, IM-444, IM-158, IM-265, IM-50, IM-159, IM-156,
IM-439, IM-157, IM-508, IM-514, IM-478, IM-473, IM-360, IM-435,
IM-150, IM-151, IM-110, IM-51, IM-163, IM-437, IM-546, IM-153, and
IM-268, or

(B) WM-61, WM-447, WM-446, WM-133, WM-119, WM-278, WM-134,
WM-363, WM-282, WM-362, WM-120, WM-290, WM-65, WM-277,
WM-70, WM-369, WM-17, WM-473, WM-47, WM-203, WM-276,
WM-279, WM-62, WM-366, WM-456, WM-428, WM-384, WM-287,
WM-420, WM-292, WM-431, WM-455, WM-20, WM-340, WM-105,
WM-389, WM-63, WM-354, WM-450, WM-466, WM-296, WM-343,
WM-341, WM-339, WM-55, WM-66, WM-48, WM-38, WM-138, and
WM-310,

[0007] wherein the biomarker is differentially present in samples of a subject with 
lung cancer and a so-called "normal" subject that is free of lung cancer. 

[0008] More preferably, one or more of the top 15 biomarkers as shown in Figure 2
or Figures 4A and 4B is used to qualify lung cancer status. Thus, for assessing overall
lung cancer status versus normal, the protein is selected from the group consisting of

(A) IM-522, IM-273, IM-520, IM-519, IM-454, IM-507, IM-521, IM-148,
IM-266, IM-537, IM-471, IM-510, IM-544, IM-474, and IM-155, or

(B) WM-61, WM-447, WM-446, WM-133, WM-119, WM-278, WM-134,
WM-363, WM-282, WM-362, WM-120, WM-290, WM-65, WM-277, and
WM-70.

[0009] Still more preferably, one or more of the top 5 biomarkers as shown in 
Figure 2 or Figures 4A and 4B is used to qualify lung cancer status. In this instance, 
for overall lung cancer status versus normal, the biomarker is selected from the group 
consisting of 

(A) IM-522, IM-273, IM-520, IM-519, and IM-454, or

(B) WM-61, WM-447, WM-446, WM-133, and WM-119.

[0010] In one embodiment, the method measures a plurality of biomarkers. The 
plurality of biomarkers can be measured simultaneously. 

[0011] Biomarkers that, by themselves, are able to identify lung cancer include the 
WM-446 and WM-447 protein biomarkers, and these are particularly preferred. 

[0012] The present invention also provides a method for qualifying lung cancer
status in a subject (e.g., a patient), comprising (A) providing a spectrum generated by
subjecting a biological sample from said subject to mass spectroscopic analysis that
includes profiling on a chemically-derivatized affinity surface, and (B) putting the
spectrum through pattern-recognition analysis that is keyed to at least one peak
selected from the top 50 biomarkers as shown in Figure 2 or Figures 4A and 4B.
Thus, for qualifying overall lung cancer status, the biomarker is selected from the
group consisting of

(i) IM-522, IM-273, IM-520, IM-519, IM-454, IM-507, IM-521, IM-148,
IM-266, IM-537, IM-471, IM-510, IM-544, IM-474, IM-155, IM-157,
IM-176, IM-445, IM-177, IM-440, IM-468, IM-438, IM-547, IM-359,
IM-436, IM-106, IM-455, IM-444, IM-158, IM-265, IM-50, IM-159, IM-156,
IM-439, IM-157, IM-508, IM-514, IM-478, IM-473, IM-360, IM-435,
IM-150, IM-151, IM-110, IM-51, IM-163, IM-437, IM-546, IM-153, and
IM-268, or

(ii) WM-61, WM-447, WM-446, WM-133, WM-119, WM-278, WM-134,
WM-363, WM-282, WM-362, WM-120, WM-290, WM-65, WM-277,
WM-70, WM-369, WM-17, WM-473, WM-47, WM-203, WM-276,
WM-279, WM-62, WM-366, WM-456, WM-428, WM-384, WM-287,
WM-420, WM-292, WM-431, WM-455, WM-20, WM-340, WM-105,
WM-389, WM-63, WM-354, WM-450, WM-466, WM-296, WM-343,
WM-341, WM-339, WM-55, WM-66, WM-48, WM-38, WM-138, and
WM-310.

[0013] For assessing the overall lung cancer status, the pattern-recognition analysis
may, for example, be keyed to a pair of peaks selected from the group consisting of

(A) IM-266 and IM-474, IM-266 and IM-38, IM-266 and IM-454, IM-266
and IM-522, IM-266 and IM-544, IM-266 and IM-471, IM-474 and
IM-151, IM-474 and IM-156, IM-474 and IM-544, IM-474 and IM-38,
IM-522 and IM-507, IM-522 and IM-156, and IM-522 and IM-440;

or

(B) WM-447 and WM-59, WM-447 and WM-19, WM-447 and WM-118,
WM-447 and WM-473, WM-19 and WM-59, WM-19 and WM-473,
WM-19 and WM-369, WM-61 and WM-154, WM-61 and WM-369, WM-118
and WM-59, and WM-282 and WM-127.

[0014] More preferably, for assessing overall lung cancer status, the pattern- 
recognition analysis is keyed to a pair of peaks selected from the group consisting of 

(A) IM-266 and IM-474, IM-266 and IM-544, and IM-156 and IM-522; 
or 

(B) WM-447 and WM-59, WM-447 and WM-19, and WM-19 and WM-59.

[0015] Alternatively, the pattern-recognition analysis for assessing overall lung
cancer status may be keyed to a triplet of peaks selected from the group consisting of

(A) IM-266, IM-454 and IM-474; and IM-266, IM-474 and IM-544;
or

(B) WM-447, WM-19 and WM-473.

[0016] In other embodiments, the pattern-recognition analysis may be keyed to a
combination of more than three peaks, more particularly to a combination of 4, 5 or 6 
peaks, where the combination is selected from among the combinations shown in 
Tables 1 and 2 herein. 

[0017] In each case, the biomarker is differentially present in samples of a subject 
with lung cancer and a normal subject. 

[0018] The invention also contemplates a kit for detecting and diagnosing lung 
cancer, thereby to assess lung cancer status. Kits within the invention comprise, for 
example, (i) an adsorbent attached to a substrate that retains one or more of the 
biomarkers shown in Figure 2 or Figures 4A and 4B, and (ii) instructions to detect the 
biomarker(s) by contacting a sample with the adsorbent and detecting the 
biomarker(s) retained by the adsorbent. An inventive kit may further comprise a 
washing solution and/or instructions for making a washing solution. The kits may 
include more than one type of adsorbent, each present on a different substrate, e.g., on a
WCX biochip and an IMAC biochip. In addition, the kits may comprise one or more containers
with biomarker samples, to be used as standard(s) for calibration. The substrate 
comprising the adsorbent may be designed to engage a probe interface and, hence, 
function as a probe in gas phase ion spectrometry, preferably mass spectrometry. 
Alternatively, the kit may further comprise a second substrate adapted to engage the 
probe interface, on which the substrate comprising the adsorbent is mounted. 

[0019] The method and kit according to the invention produce an article of
manufacture in which one or more biomarkers according to the invention are bound to
an adsorbent, optionally contacted with a matrix or energy absorbing molecule.

[0020] The present invention also provides software for qualifying lung carcinoma
status in a subject, comprising an algorithm for analyzing data extracted from a 
spectrum generated by mass spectroscopic analysis of a biological sample taken from 
the subject, wherein said data relates to one or more biomarkers according to the 
invention. In one embodiment, the algorithm carries out a pattern-recognition 
analysis that is keyed to data relating to at least one of the biomarkers. In another 
embodiment, the algorithm comprises classification tree analysis that is keyed to data 
relating to at least one of the biomarkers. In yet another embodiment, the algorithm 
comprises an artificial neural network analysis that is keyed to data relating to at least
one of the biomarkers. 

[0021] In certain embodiments, the present invention provides methods and kits that
use serum amyloid A protein or a fragment thereof to qualify lung carcinoma status in
a subject. In one of these embodiments, the serum amyloid A biomarker has an
apparent molecular weight of about 2803, 3168, 3277, 3552, 3897, 4300, 4490, 4655,
5927, 6874, 7776, 7941, 8152, 8952, 9233, 10300, 10866, or 10851 Daltons. In
another embodiment, the serum amyloid A biomarker has an apparent molecular
weight of about 3168, 3277, 3552, 3897, 4300, 4490, 4655, 7776, 7941, 8152, 8952,
or 10851 Daltons. In yet another embodiment, the serum amyloid A biomarker has an
apparent molecular weight of about 11.5 to 11.7 kD.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0022] Figures 1A-1D show all biomarkers identified with a Cu(II) IMAC3
ProteinChip® array format.

[0023] Figure 2 shows the top 50 biomarkers identified with a Cu(II) IMAC3
ProteinChip® array format.

[0024] Figures 3A-3O show all biomarkers identified with a WCX ProteinChip®
array format.

[0025] Figures 4A and 4B show the top 50 biomarkers identified with a WCX 
ProteinChip® array format. 

[0026] Figure 5 shows fragments of serum amyloid A (SAA) that are biomarkers 
according to the present invention. 

[0027] Figure 6 shows identification of SAA biomarkers with an anti-SAA
antibody. 

[0028] Figures 7-16 are spectra from WCX chips in which all of the top 15 WCX 
marker peaks are labeled, along with various other peaks from among the top 50 
WCX peaks. Red shows spectra from lung cancer patients and gray shows normals. 

[0029] Figures 17-28 are spectra from IMAC chips in which all of the top 15 IMAC
marker peaks are labeled, along with various other peaks from among the top 50
IMAC peaks. Blue shows spectra from lung cancer patients and gray shows normals.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0030] In accordance with the present invention, a series of biomarkers associated 
with lung cancer has been discovered. In the present context, a biomarker is an 
organic biomolecule, particularly a polypeptide or protein, which is differentially 
present in a sample taken from a subject having lung cancer as compared to a 
comparable sample taken from a normal subject. A biomarker also may be 
differentially present in a sample taken from a subject with one type of lung cancer, 
e.g., small cell carcinoma, as compared to a comparable sample taken from a subject 
with a different type of lung cancer, e.g., adenocarcinoma or squamous cell 
carcinoma, or differentially present at different stages of a type of lung cancer. A 
biomarker is differentially present in samples taken from two groups of subjects if it is 
present at an elevated level or a decreased level in samples of the first group as 
compared to samples of the second group. More particularly, a biomarker is a 
polypeptide that is characterized by an apparent molecular weight, as determined by 
mass spectrometry, and that is present in samples from lung cancer subjects in an 
elevated or decreased level, as compared to subjects that do not have lung cancer. A 
biomarker is differentially present between two sets of samples if the amount of the 
biomarker in one sample set differs in a statistically significant way (p < 0.01) from 
the amount of biomarker in the other sample set. 
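
By way of illustration only, the significance criterion stated above (p < 0.01) might be checked for a single peak as sketched below. The choice of a permutation test, the function names and the intensity values are assumptions made for this example; the present description does not prescribe a particular statistical test.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_gap(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = mean_gap(pooled)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        if mean_gap(pooled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


def is_differentially_present(cancer, normal, alpha=0.01):
    """True if peak intensities differ between the groups at p < alpha."""
    return permutation_p_value(cancer, normal) < alpha


# Hypothetical peak intensities for one candidate biomarker.
cancer = [10.1, 11.3, 9.8, 12.0, 10.7, 11.1, 10.4, 11.8]
normal = [4.9, 5.2, 6.1, 5.5, 4.7, 5.8, 5.1, 6.0]
print(is_differentially_present(cancer, normal))
```

A rank-based test could be substituted without changing the surrounding logic.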

[0031] The biomarkers of the invention can be used to assess lung cancer status in a 
subject. For example, they are capable of identifying lung cancer and successfully 
distinguishing it from normal subjects, thereby providing a way of diagnosing the 
presence or absence of lung cancer, including the presence or absence of a particular 
kind of lung cancer. In addition, the biomarkers are useful in assessing the risk of 
developing lung cancer, in staging of lung cancer and in assessing the effectiveness of 
treatment. Thus, "lung cancer status" in the context of the present invention includes, 
inter alia, the presence or absence of disease, the risk of developing disease, the stage 
of the disease, and the effectiveness of treatment of disease. Based on this status, 
further procedures may be indicated, including additional diagnostic tests or 
therapeutic procedures or regimens, such as endoscopy, biopsy, surgery, 
chemotherapy, immunotherapy, and radiation therapy. 

[0032] In some instances, a single biomarker is capable of identifying lung cancer
with a sensitivity or specificity of at least 85%, whereas, in other instances, a 
combination or plurality of biomarkers is used to obtain a sensitivity or specificity of 
at least 85%. The biomarkers and combinations of biomarkers thus can be used to 
qualify lung cancer status in a subject or patient. 

[0033] The biomarkers according to the invention are present in serum. The
biological sample used according to the present invention, however, need not be a 
serum sample. Thus, a biological sample for qualifying lung cancer status may be a 
serum, plasma or blood sample, although serum samples are preferred. 

[0034] All of the biomarkers are characterized by molecular weight. A list of all the
biomarkers obtained with the Cu(II) IMAC3 ProteinChip® array (Ciphergen
Biosystems, Inc., Fremont, California, USA) is provided in Figures 1A-1D, and
Figure 2 lists the top 50 biomarkers that distinguish between lung cancer and normal
subjects, as identified by the Cu(II) IMAC3 protocol described herein. Figures 3A-3O
comprise a list of all the biomarkers obtained with the WCX2 ProteinChip® array,
and Figures 4A and 4B comprise a ranking of the top 50 biomarkers that distinguish
between (i) lung cancer and normal subjects, (ii) subjects with each of four types of
lung cancer and normal subjects, and (iii) two types of lung cancer, e.g.,
adenocarcinoma versus squamous cell carcinoma, as identified by the WCX2 protocol
described herein.

[0035] The top 50 biomarkers were determined by decision tree analysis using 
Biomarker Patterns™ software from Ciphergen Biosystems, Inc. Biomarkers other 
than those within the top 50 also are useful in distinguishing between subjects with 
lung cancer and normal subjects and may, in particular, appear in decision trees with 
multiple nodes. In preferred embodiments, one or more of the top 15 biomarkers are 
used, and in even more preferred embodiments, one or more of the top 5 biomarkers 
are used. 

[0036] In each of Figures 1A-1D and 3A-3O, the number in the first column is the
biomarker identifier. Thus, the first row in Figures 1A-1D relates to biomarker IM-1,
the second row relates to biomarker IM-2, and so forth ("IM-" denoting biomarkers
identified with the IMAC chip). Similarly, the first row in Figures 3A-3O relates to
biomarker WM-1 and the second row relates to biomarker WM-2 ("WM-" denoting
biomarkers identified with the WCX2 chip). The number in the second column in
Figures 1A-1D is the apparent molecular weight of the biomarker in daltons, as
determined by mass spectrometry. In Figures 3A-3O, the apparent molecular weights
for the biomarkers identified in the first column are reported in columns 3 through 11.
The letter in the second column of Figures 1A-1D and the third column of Figures
3A-3O denotes the fraction in which the biomarker elutes in the protocol described
herein; that is, biomarkers with an "A" elute in the first fraction, biomarkers with a
"B" elute in the second fraction, and so forth. The fraction in which the biomarker
elutes correlates with its pI, with biomarkers eluting at higher pH having a higher pI,
and biomarkers eluting at lower pH having a lower pI.

[0037] Presenting the mass and affinity characteristics of a given biomarker within 
the invention, as in this description, characterizes that biomarker so as to allow one to
obtain and measure it, in accordance with the teachings herein. If desired, any of the 
biomarkers can be sequenced, in order to obtain an amino acid sequence, but this is 
not required to practice the present invention. 

[0038] For example, a biomarker can be peptide mapped with a number of enzymes, 
such as trypsin and V8 protease, and the molecular weights of the digestion 
fragments can be used to search databases for sequences that match the molecular 
weights of the digestion fragments generated by the various enzymes. Alternatively, 
if the biomarkers are not proteins included in known databases, degenerate probes can 
be made based on the N-terminal amino acid sequence of the biomarker, which then 
are used to screen a genomic or cDNA library created from a sample from which the 
biomarker was initially detected. The positive clones can be identified, amplified, and 
their recombinant DNA sequences can be subcloned using techniques which are well 
known. Finally, protein biomarkers can be sequenced using protein ladder 
sequencing. Protein ladders can be generated by fragmenting the molecules and 
subjecting fragments to enzymatic digestion or other methods that sequentially 
remove a single amino acid from the end of the fragment. The ladder is then analyzed 
by mass spectrometry. The difference in masses of the ladder fragments identifies the 
amino acid removed from the end of the molecule. 
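
The ladder read-out described above can be sketched as follows. This is a minimal illustration, not a description of any particular sequencing software: the function name, the tolerance and the example ladder are assumptions, and the table holds standard monoisotopic residue masses for a subset of amino acids only. Leucine and isoleucine have the same mass and cannot be told apart by this approach.

```python
# Monoisotopic residue masses in daltons (subset, for illustration).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L/I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259,
    "M": 131.04049, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333, "W": 186.07931,
}


def read_ladder(masses, tolerance=0.02):
    """Name the residue removed at each step of a ladder of fragment masses.

    Each difference between consecutive masses (heaviest first) is matched
    to the closest residue mass; "?" marks a difference with no match
    within the tolerance.
    """
    ordered = sorted(masses, reverse=True)
    residues = []
    for heavier, lighter in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        name, mass = min(RESIDUE_MASS.items(), key=lambda kv: abs(kv[1] - delta))
        residues.append(name if abs(mass - delta) <= tolerance else "?")
    return residues


# Hypothetical ladder: a 1000.0 Da fragment losing G, then A, then F.
ladder = [1000.0, 942.97854, 871.94143, 724.87302]
print(read_ladder(ladder))
```

With the mass accuracy of a linear time-of-flight instrument, a wider tolerance and average rather than monoisotopic masses would likely be more appropriate.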

[0039] Several biomarkers identified in accordance with the teachings of the present
invention correspond to serum amyloid A (SAA) or to a fragment of SAA. SAA is a
well-known acute phase inflammatory marker. A number of the SAA biomarkers are
identified in Figure 5 by both molecular mass and amino acid sequence. Most of
these markers bound anti-SAA antibodies, as shown in Figure 6. The intact mass of
SAA is 11.5 to 11.7 kD, and these biomarkers also have been identified by the present
methodology. Fragments preferably have a molecular mass of at least about 200
Daltons, more preferably at least about 500 Daltons. In even more preferred
embodiments, fragments have a molecular mass of at least about 800 Daltons, and
most preferably at least about 1 kilodalton.

[0040] In one embodiment, the fragments of SAA include a sequence of amino 
acids that forms an epitope recognized by an anti-SAA antibody. One way of
identifying suitable fragments for use in the present invention is to enzymatically 
digest SAA and test the resulting fragments for the ability to bind to an anti-SAA 
antibody. Fragments that bind anti-SAA antibody can be sequenced using techniques 
well-known in the art, although the sequence of the fragment is not needed to practice 
the invention. In order to practice the invention with a fragment from the enzymatic 
digest that is identified as binding anti-SAA antibody, all that is required is to subject
the fragment to mass spectrometry to determine its mass.

[0041] The serum biomarkers according to the present invention were identified by 
comparing mass spectra of samples derived from sera from two groups of newly- 
diagnosed subjects, subjects with lung cancer and normal subjects. The subjects were 
diagnosed according to standard clinical criteria. Lung cancer subjects were 
histologically confirmed, and subjects without lung cancer were followed for at least 
18 months following serum collection for any sign of lung cancer, to exclude subjects 
with asymptomatic lung cancer. 

[0042] Sera from each group of subjects were collected and fractionated with Q
Ceramic HyperD F ion exchange resin (Biosepra SA, France) into six fractions, which
eluted at different pH. Fraction A comprised the flow-through plus pH 9 eluant,
Fraction B comprised the pH 7 eluant, Fraction C comprised the pH 5 eluant, Fraction
D comprised the pH 4 eluant, Fraction E comprised the pH 3 eluant, and Fraction F
comprised an isopropyl alcohol/acetonitrile/TFA eluant. Fractions A through F are
identified on Figures 7-28 as Fractions 1 through 6, respectively.
[0043] Each fraction was diluted and applied to a ProteinChip® array, either a 
Cu(II) IMAC3 or WCX2 chip array. Both of these chip arrays are produced by 
Ciphergen Biosystems, Inc. (Fremont, CA). 

[0044] The Cu(II) IMAC3 is an "immobilized metal affinity-capture" chip, with a 
nitrilotriacetic acid surface for high-capacity copper binding and subsequent affinity 
capture of proteins with metal binding residues. Imidazole may be used in binding 
and washing solutions to moderate protein binding, including binding of non-specific 
proteins. Increasing the concentration of imidazole in the washing buffers reduces the 
binding of the target proteins. The IMAC3 surface is produced by photopolymerizing
5-methylacylamido-2-(N,N-biscarboxymethylamino)pentanoic acid (7.5 wt%) and
N,N'-methylenebisacrylamide (0.4 wt%) using (-) riboflavin (0.02 wt%) as a
photoinitiator. The monomer solution is deposited onto the chip substrate and
irradiated to photopolymerize. The chip then is activated with Cu(II).

[0045] The WCX2 is a weak cation exchange array with a carboxylate surface to 
bind cationic proteins. The negatively charged carboxylate groups on the surface of 
the WCX2 chip interact with the positive charges exposed on the target proteins. The 
binding of the target proteins is reduced by increasing the concentration of salt or by 
increasing the pH of the washing buffers. 

[0046] Following application of the eluant fraction, the chips were incubated to 
allow the polypeptides in the eluant to bind to the sites on the chip by an affinity 
interaction. After incubation, each chip array was washed to remove polypeptides 
that bind non-specifically and buffer contaminants. Each chip then was dried, and an
energy absorbing molecule or matrix was applied to it, to facilitate desorption and 
ionization in a mass spectrometer. 

[0047] In the mass spectrometer, retained polypeptides were desorbed from the chip 
array by laser desorption and ionization in a ProteinChip® Reader, which is integrated 
with ProteinChip® Software and a personal computer to analyze proteins captured on 
chip arrays. The ion optic and laser optic technologies in the ProteinChip® Reader
detect proteins ranging from small peptides of less than 1000 Da up to proteins of
300 kilodaltons or more, and the mass is calculated based on time-of-flight. Ionized
polypeptides were detected and their mass accurately determined by this time-of-flight
(TOF) mass spectrometry.
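
For orientation, the relationship between flight time and mass in an idealized linear time-of-flight analyzer can be sketched as below. This is a textbook approximation (all ions accelerated through the same potential, no delayed extraction, no calibration terms), not a description of the ProteinChip® Reader itself; the accelerating voltage and flight-tube length used in the example are assumed values.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms


def mz_from_flight_time(t_seconds, voltage, length_m):
    """m/z in daltons per unit charge, from z*e*U = (1/2)*m*v**2 and v = L/t."""
    return 2.0 * ELEMENTARY_CHARGE * voltage * t_seconds ** 2 / (length_m ** 2 * DALTON)


def flight_time_from_mz(mz, voltage, length_m):
    """Inverse relation: flight time in seconds for a given m/z."""
    return length_m * math.sqrt(mz * DALTON / (2.0 * ELEMENTARY_CHARGE * voltage))


# Assumed instrument parameters: 20 kV acceleration, 1 m flight tube.
t = flight_time_from_mz(11705.0, voltage=20000.0, length_m=1.0)
print(t, mz_from_flight_time(t, voltage=20000.0, length_m=1.0))
```

Because m/z scales with the square of the flight time, heavier ions arrive later, and in practice the proportionality constant is fitted from calibrants of known mass.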

[0048] The mass spectra obtained for each group were subjected to scatter plot 
analysis, to eliminate run-to-run variation. Protein clusters on the scatter plot that had 
the same pattern for both lung cancer and normal subjects, i.e., protein clusters that
were either elevated in both groups of subjects or depressed in both groups of 
subjects, were eliminated as potential biomarkers. The remaining polypeptides were 
further analyzed for their ability to accurately identify subjects with lung cancer. 
Because the molecular weights were derived from scatter plot analysis, and because 
of limits on the ability of mass spectrometry to resolve molecular weights, the
"absolute" molecular weight values given in Figures 1A-1D and 3A-3O actually
represent approximate molecular weights. 

[0049] The biomarkers of this invention are characterized by their mass-to-charge 
ratio as determined by mass spectrometry. The mass-to-charge ratio of each 
biomarker is provided in Figures 1A-1D and 3A-3O. For example, IM-1 in Figure 1A
has a measured mass-to-charge ratio of 2011. The mass-to-charge ratios were
determined from mass spectra generated on a Ciphergen Biosystems, Inc. PBS II mass 
spectrometer. This instrument has a mass accuracy of about +/- 0.15 percent. 
Additionally, the instrument has a mass resolution of about 400 to 1000 m/dm, where 
m is mass and dm is the mass spectral peak width at 0.5 peak height. The mass-to- 
charge ratio of the biomarkers was determined using Biomarker Wizard™ software 
(Ciphergen Biosystems). Biomarker Wizard assigns a mass-to-charge ratio to a 
biomarker by clustering the mass-to-charge ratios of the same peaks from all the 
spectra analyzed, as determined by the PBS II, taking the sum of the maximum and minimum
mass-to-charge ratios in the cluster, and dividing by two. Accordingly, the masses
provided reflect these specifications.
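
The assignment rule and the resolution figure given above can be illustrated with a short sketch. It is not the Biomarker Wizard™ implementation; the function names and the example cluster are assumptions, and the clustering step itself (deciding which peaks across spectra belong together) is taken as already done.

```python
def cluster_midpoint(mz_values):
    """Assign one m/z to a cluster: midpoint of its maximum and minimum."""
    return (max(mz_values) + min(mz_values)) / 2.0


def peak_width_at_half_height(mz, resolution):
    """Peak width dm implied by a resolution m/dm at the given m/z."""
    return mz / resolution


# Hypothetical m/z readings of the same peak in three spectra.
cluster = [2009.0, 2011.5, 2013.0]
print(cluster_midpoint(cluster))
print(peak_width_at_half_height(2011.0, 400), peak_width_at_half_height(2011.0, 1000))
```

At a resolution of 400 to 1000, a peak near m/z 2011 is therefore roughly 2 to 5 mass units wide at half height, which bounds how closely spaced two biomarkers can be and still be resolved.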

[0050] The biomarkers of this invention are further characterized by the shape of 
their spectral peak in time-of-flight mass spectrometry. Mass spectra showing peaks 
representing the biomarkers are presented in Figures 7-28. The biomarker identifier 
numbers from Figures 2 and 4A-4B, respectively, are shown next to the peak, along 
with their rank, which is indicated in parentheses below the biomarker identifier
number. 

[0051] The biomarkers of this invention are further characterized by their binding 
properties on chromatographic surfaces. Most of the biomarkers bind to IMAC (Cu) 
or WCX adsorbents (e.g., the Ciphergen® IMAC (Cu) or WCX ProteinChip® arrays) 
after washing as described herein. 

[0052] Thus, a given molecular weight for a biomarker herein should be interpreted 
as the midpoint of a molecular-weight range. The accuracy of the mass spectrometer 
is +/- 0.15%, and the actual molecular weight for a biomarker is therefore the value 
given, +/- 0.15%. For example, the actual molecular weight for biomarker IM-273 is 
11705 +/- 0.15%, or between 11687 and 11722. Often, the range surrounding the 
"absolute" value given in the figure is no more than +/- 5 daltons (2006 to 2016 for 
IM-1), generally no more than +/- 3 daltons (2008 to 2014 for IM-1), and often as 
small as +/- 1 dalton (2010 to 2012 daltons for IM-1). 
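
The molecular-weight range described above can be computed as in the following sketch, which also shows how an observed mass might be compared against a listed biomarker. The function names are chosen for this example only.

```python
def mass_window(nominal_mass, accuracy_percent=0.15):
    """Lower and upper bound of the range around a listed molecular weight."""
    half_width = nominal_mass * accuracy_percent / 100.0
    return nominal_mass - half_width, nominal_mass + half_width


def matches(observed_mass, nominal_mass, accuracy_percent=0.15):
    """True if an observed mass falls within the range for a listed biomarker."""
    low, high = mass_window(nominal_mass, accuracy_percent)
    return low <= observed_mass <= high


# IM-273 is listed at 11705 daltons.
low, high = mass_window(11705)
print(low, high)
print(matches(11700, 11705), matches(11650, 11705))
```

For IM-273 the half-width is 11705 x 0.0015, or about 17.6 daltons, which gives the range of approximately 11687 to 11722 stated above.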

[0053] CART® (Salford Systems, San Diego, CA), a classification and regression 
tree software, was used to determine whether a potential biomarker had predictive 
value in assessing lung cancer. A software macro randomly selected a subset of 15%
of the peaks from Figures 1A-1D or Figures 3A-3O. The peaks and peak heights
from each sample were provided to the CART® software for analysis. The software 
performed an iterative analysis until a single decision tree was generated that was 
capable of distinguishing between cancerous and non-cancerous samples. Each node in the
resulting decision tree sorted samples based on the peak height of a single biomarker. A tree
may contain any number of nodes, but generally contains from 1 to 6 nodes. From a 
practical standpoint in a commercial diagnostic test, a decision tree with fewer nodes 
is preferred. A total of 2000 decision trees, each based on a different 15% subset of
the peaks from Figures 1A-1D or Figures 3A-3O, were generated.
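
The subset-selection step can be sketched as follows. This is not the macro actually used; the function name, the seed and the example peak list are assumptions, and the tree-building step performed by the CART® software is not reproduced here.

```python
import random


def draw_peak_subsets(peak_ids, fraction=0.15, n_subsets=2000, seed=0):
    """Draw distinct random subsets, each holding a fixed fraction of the peaks.

    Assumes far more possible subsets exist than are requested; otherwise
    the loop would not terminate.
    """
    rng = random.Random(seed)
    size = max(1, round(len(peak_ids) * fraction))
    seen = set()
    subsets = []
    while len(subsets) < n_subsets:
        subset = frozenset(rng.sample(peak_ids, size))
        if subset not in seen:  # keep every subset different from the others
            seen.add(subset)
            subsets.append(sorted(subset))
    return subsets


# Hypothetical peak list of 100 identifiers; each subset then holds 15 peaks.
peaks = [f"IM-{i}" for i in range(1, 101)]
subsets = draw_peak_subsets(peaks)
print(len(subsets), len(subsets[0]))
```

One tree would then be grown per subset, using only the peak heights of the peaks in that subset.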
[0054] The CART® software assigned a score to each biomarker in the subset, 
based on its relative importance. A score of 100 is very high and a score of 0 is very 
low. The CART® software also determined the sensitivity and specificity of each 
decision tree. 

[0055] The data generated by the decision tree analysis were subjected to further
analysis. The biomarkers were ranked based on their average scores, which were
determined by adding up a biomarker's scores for each decision tree in which it
appeared, and dividing by the total number of decision trees in which the biomarker
appeared. Approximately 500 of the potential biomarkers showed up in at least one
tree, and most of the biomarkers showed up in about 150 to 400 of the two thousand
trees. The top 50 biomarkers for the IMAC and WCX chip arrays as determined by
this method are shown in Figures 2 and 4A-4B, respectively.
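
The averaging and ranking just described can be sketched as below. The data structure (one dictionary of scores per tree) and the scores themselves are assumptions made for illustration.

```python
from collections import defaultdict


def rank_by_average_score(trees):
    """Rank biomarkers by mean score over the trees in which they appear.

    `trees` is a list of dicts mapping biomarker identifier to the score
    (0 to 100) that the biomarker received in that tree.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in trees:
        for marker, score in scores.items():
            totals[marker] += score
            counts[marker] += 1
    averages = {marker: totals[marker] / counts[marker] for marker in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores from three trees.
example_trees = [
    {"IM-522": 100, "IM-273": 80},
    {"IM-522": 90},
    {"IM-273": 60, "IM-520": 50},
]
ranking = rank_by_average_score(example_trees)
print(ranking)
```

Note that the divisor is the number of trees containing the biomarker, not the total number of trees, so a biomarker that appears rarely but always scores highly can still rank near the top.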
[0056] All of the trees having sensitivities and specificities greater than 85% also 
were identified. All trees capable of distinguishing lung cancer from normal and 
having from 1 to 6 nodes that meet the 85/85 criterion are shown in Tables 1 and 2. 
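
A minimal sketch of the 85/85 criterion follows. The nested-dictionary tree representation, the threshold and the sample data are hypothetical; they stand in for a tree in which each node sorts on the peak height of a single biomarker.

```python
def classify(tree, peaks):
    """Walk the tree; each node compares one biomarker's peak height."""
    node = tree
    while isinstance(node, dict):
        branch = "low" if peaks[node["marker"]] <= node["threshold"] else "high"
        node = node[branch]
    return node  # a leaf: "cancer" or "normal"


def sensitivity_specificity(tree, samples):
    """Fractions of cancer and of normal samples classified correctly.

    Assumes `samples` contains at least one sample of each class.
    """
    tp = fn = tn = fp = 0
    for peaks, label in samples:
        predicted = classify(tree, peaks)
        if label == "cancer":
            if predicted == "cancer":
                tp += 1
            else:
                fn += 1
        elif predicted == "normal":
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def meets_85_85(tree, samples, cutoff=0.85):
    """True if both sensitivity and specificity exceed the cutoff."""
    sensitivity, specificity = sensitivity_specificity(tree, samples)
    return sensitivity > cutoff and specificity > cutoff


# Hypothetical one-node tree and four labeled samples.
one_node_tree = {"marker": "WM-447", "threshold": 5.0, "low": "normal", "high": "cancer"}
labeled = [
    ({"WM-447": 9.0}, "cancer"),
    ({"WM-447": 8.0}, "cancer"),
    ({"WM-447": 2.0}, "normal"),
    ({"WM-447": 1.0}, "normal"),
]
print(sensitivity_specificity(one_node_tree, labeled), meets_85_85(one_node_tree, labeled))
```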

TABLE 1. Decision trees with IMAC Biomarkers.

TABLE 2. Decision trees with WCX Biomarkers.

[0057] Each of the biomarker combinations of Tables 1 and 2 is a preferred
combination for distinguishing lung cancer subjects from normal subjects in
accordance with the present invention.

[0058] All biomarkers that appeared in at least two of the trees that met the 85/85 
criterion were identified. For these biomarkers, Tables 3 and 4 provide the number of 
times the biomarker occurred in trees that met the criterion, as well as the ranking of
that biomarker on the top 50 lists of Figures 2 and 4A-4B. 

TABLE 3. Correlation of IMAC biomarker decision tree frequencies and 
ranking. 

TABLE 4. Correlation of WCX biomarker decision tree frequencies and
ranking. 

[0059] Biomarkers that occurred frequently in the highly discriminatory trees
occurred among the top 50 ranked biomarkers, and typically had a top 10 ranking. In 
addition, certain pairs of biomarkers reappear, e.g., WM-447 and WM-59, WM-447 
and WM-19, WM-19 and WM-59, IM-266 and IM-474, IM-266 and IM-38, IM-266 
and IM-454, IM-522 and IM-266. There also are repeats among triplets of 
biomarkers, such as IM-266, IM-266 and IM-38, and WM-447, WM-19 and WM-473. 
Other repeating pairs and trios of biomarkers can be seen in Tables 3 and 4, and 
are preferred. 
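
The tallying of biomarkers and recurring biomarker pairs across trees can be illustrated with a short Python sketch. The trees listed here are hypothetical and serve only to show the counting; they do not reproduce the contents of Tables 1 to 4, and the function name is an assumption.

```python
from collections import Counter
from itertools import combinations


def count_markers_and_pairs(trees):
    """Count how many trees contain each marker and each pair of markers."""
    marker_counts = Counter()
    pair_counts = Counter()
    for tree in trees:
        # Sort the distinct markers so each pair has one canonical ordering
        markers = sorted(set(tree))
        marker_counts.update(markers)
        pair_counts.update(combinations(markers, 2))
    return marker_counts, pair_counts


# Hypothetical trees; marker names are borrowed from the text only as labels
example_trees = [
    ["WM-447", "WM-59"],
    ["WM-447", "WM-19", "WM-59"],
    ["WM-19", "WM-59"],
]
marker_counts, pair_counts = count_markers_and_pairs(example_trees)

# Markers appearing in at least two trees, as in the selection described above
repeated_markers = {m: n for m, n in marker_counts.items() if n >= 2}
```

Triplets could be counted the same way by passing 3 instead of 2 to `combinations`.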

[0060] Biomarkers and combinations of biomarkers identified in accordance with 
the present description may be used to qualify lung cancer status in a subject. In 
particular, a biomarker or combination of biomarkers can be used to distinguish lung 
cancer patients from normal patients with a high degree of specificity or sensitivity, 
i.e., greater than at least 85%, preferably greater than at least 90%, and more 
preferably greater than 95%. 

[0061] According to one aspect of the invention, therefore, the detection of 
biomarkers for diagnosis of lung cancer status entails contacting a sample from a 
subject with a substrate, e.g., a SELDI probe, having an adsorbent thereon, under 
conditions that allow binding between the biomarker and the adsorbent, and then 
detecting the biomarker bound to the adsorbent by gas phase ion spectrometry, for 
example, mass spectrometry. Other detection paradigms that can be employed to this 
end include optical methods, electrochemical methods (voltammetry and amperometry 
techniques), atomic force microscopy, and radio frequency methods, e.g., multipolar 
resonance spectroscopy. Illustrative of optical methods, in addition to microscopy, 
both confocal and non-confocal, are detection of fluorescence, luminescence, 
chemiluminescence, absorbance, reflectance, transmittance, and birefringence or 
refractive index (e.g., surface plasmon resonance, ellipsometry, a resonant mirror 
method, a grating coupler waveguide method or interferometry). 
[0062] In one aspect, the markers of this invention are detected by gas phase ion 
spectrometry, which refers to the use of a gas phase ion spectrometer to detect gas 
phase ions. A gas phase ion spectrometer is an apparatus that detects gas phase ions. 
Gas phase ion spectrometers include an ion source that supplies gas phase ions. Gas 
phase ion spectrometers include, for example, mass spectrometers, ion mobility 
spectrometers, and total ion current measuring devices. 

[0063] "Mass spectrometer" refers to a gas phase ion spectrometer that measures a 
parameter which can be translated into mass-to-charge ratios of gas phase ions. Mass 
spectrometers generally include an ion source and a mass analyzer. Examples of mass 
spectrometers are time-of-flight, magnetic sector, quadrupole filter, ion trap, ion 
cyclotron resonance, electrostatic sector analyzer and hybrids of these. "Mass 
spectrometry" refers to the use of a mass spectrometer to detect gas phase ions. 
"Laser desorption mass spectrometer" refers to a mass spectrometer which uses a laser 
as a means to desorb, volatilize, and ionize an analyte. 

[0064] "Mass analyzer" refers to a sub-assembly of a mass spectrometer that 
comprises means for measuring a parameter which can be translated into mass-to-charge 
ratios of gas phase ions. In a time-of-flight mass spectrometer the mass 
analyzer comprises an ion optic assembly, a flight tube and an ion detector. 
[0065] "Ion source" refers to a sub-assembly of a gas phase ion spectrometer that 
provides gas phase ions. In one embodiment, the ion source provides ions through a 
desorption/ionization process. Such embodiments generally comprise a probe 
interface that positionally engages a probe in an interrogatable relationship to a source 
of ionizing energy (e.g., a laser desorption/ionization source) and in concurrent 
communication at atmospheric or subatmospheric pressure with a detector of a gas 
phase ion spectrometer. 

[0066] Forms of ionizing energy for desorbing/ionizing an analyte from a solid 
phase include, for example: (1) laser energy; (2) fast atoms (used in fast atom 
bombardment); (3) high energy particles generated via beta decay of radionuclides 
(used in plasma desorption); and (4) primary ions generating secondary ions (used in 
secondary ion mass spectrometry). The preferred form of ionizing energy for solid 
phase analytes is a laser (used in laser desorption/ionization), in particular, nitrogen 
lasers, Nd-YAG lasers and other pulsed laser sources. "Fluence" refers to the laser 
energy delivered per unit area of interrogated image. Typically, a sample is placed on 
the surface of a probe, the probe is engaged with the probe interface and the probe 
surface is struck with the ionizing energy. The energy desorbs analyte molecules 
from the surface into the gas phase and ionizes them. 

[0067] Other forms of ionizing energy for analytes include, for example: (1) 
electrons which ionize gas phase neutrals; (2) strong electric field to induce ionization 
from gas phase, solid phase, or liquid phase neutrals; and (3) a source that applies a 
combination of ionization particles or electric fields with neutral chemicals to induce 
chemical ionization of solid phase, gas phase, and liquid phase neutrals. 
[0068] A preferred mass spectrometric technique for use in the invention is Surface 
Enhanced Laser Desorption and Ionization (SELDI), as described, for example, in 
U.S. patents No. 5,719,060 and No. 6,225,047, both to Hutchens and Yip, in which 
the surface of a probe that presents the analyte (here, one or more of the biomarkers) 
to the energy source plays an active role in desorption/ionization of analyte 
molecules. In this context, "probe" refers to a device adapted to engage a probe 
interface and to present an analyte to ionizing energy for ionization and introduction 
into a gas phase ion spectrometer, such as a mass spectrometer. A probe typically 
includes a solid substrate, either flexible or rigid, that has a sample-presenting surface, 
on which an analyte is presented to the source of ionizing energy. 
[0069] One version of SELDI, called "Surface-Enhanced Affinity Capture" or 
"SEAC," involves the use of probes comprised of a chemically selective surface 
("SELDI probe"). A "chemically selective surface" is one to which is bound either 
the adsorbent, also called a "binding moiety" or "capture reagent," or a reactive 
moiety that is capable of binding a capture reagent, e.g., through a reaction forming a 
covalent or coordinate covalent bond. 

[0070] The phrase "reactive moiety" here denotes a chemical moiety that is capable 
of binding a capture reagent. Epoxide and carbonyldiimidazole are useful reactive 
moieties to covalently bind polypeptide capture reagents such as antibodies or cellular 
receptors. Nitrilotriacetic acid and iminodiacetic acid are useful reactive moieties that 
function as chelating agents to bind metal ions that interact non-covalently with 
histidine-containing peptides. A "reactive surface" is a surface to which a reactive 
moiety is bound. An "adsorbent" or "capture reagent" can be any material capable of 
binding a biomarker of the invention. Suitable adsorbents for use in SELDI, 
according to the invention, are described in U.S. patent No. 6,225,047, supra. 
[0071] One type of adsorbent is a "chromatographic adsorbent," which is a material 
typically used in chromatography. Chromatographic adsorbents include, for example, 
ion exchange materials, metal chelators, immobilized metal chelates, hydrophobic 
interaction adsorbents, hydrophilic interaction adsorbents, dyes, simple biomolecules 
(e.g., nucleotides, amino acids, simple sugars and fatty acids), and mixed mode 
adsorbents (e.g., hydrophobic attraction/electrostatic repulsion adsorbents). 
"Biospecific adsorbent" is another category, for adsorbents that contain a 
biomolecule, e.g., a nucleotide, a nucleic acid molecule, an amino acid, a polypeptide, 
a polysaccharide, a lipid, a steroid or a conjugate of these (e.g., a glycoprotein, a 
lipoprotein, a glycolipid). In certain instances the biospecific adsorbent can be a 
macromolecular structure such as a multiprotein complex, a biological membrane or a 
virus. Illustrative biospecific adsorbents are antibodies, receptor proteins, and nucleic 
acids. A biospecific adsorbent typically has higher specificity for a target analyte than 
a chromatographic adsorbent. 

[0072] Another version of SELDI is Surface-Enhanced Neat Desorption (SEND), 
which involves the use of probes comprising energy absorbing molecules that are 
chemically bound to the probe surface ("SEND probe"). The phrase "Energy 
absorbing molecules" (EAM) denotes molecules that are capable of absorbing energy 
from a laser desorption ionization source and, thereafter, contributing to desorption 
and ionization of analyte molecules in contact therewith. The EAM category includes 
molecules used in MALDI, frequently referred to as "matrix," and is exemplified by 
cinnamic acid derivatives, sinapinic acid (SPA), cyano-hydroxy-cinnamic acid 
(CHCA) and dihydroxybenzoic acid, ferulic acid, and hydroxyacetophenone 
derivatives. The category also includes EAMs used in SELDI, as enumerated, for 
example, by U.S. 5,719,060 and U.S. 60/351,971 (Kitagawa), filed January 25, 2002. 
[0073] Another version of SELDI, called Surface-Enhanced Photolabile Attachment 
and Release (SEPAR), involves the use of probes having moieties attached to the 
surface that can covalently bind an analyte, and then release the analyte through 
breaking a photolabile bond in the moiety after exposure to light, e.g., to laser light. 
For instance, see U.S. 5,719,060. SEPAR and other forms of SELDI are readily 
adapted to detecting a biomarker or biomarker profile, pursuant to the present 
invention. 

[0074] The detection of the biomarkers according to the invention can be enhanced 
by using certain selectivity conditions, e.g., adsorbents or washing solutions. The 
phrase "wash solution" refers to an agent, typically a solution, which is used to affect 
or modify adsorption of an analyte to an adsorbent surface and/or to remove unbound 
materials from the surface. The elution characteristics of a wash solution can depend, 
for example, on pH, ionic strength, hydrophobicity, degree of chaotropism, detergent 
strength, and temperature. 

[0075] Pursuant to one aspect of the present invention, a sample is analyzed by 
means of a "biochip," a term that denotes a solid substrate, having a generally planar 
surface, to which a capture reagent (adsorbent) is attached. Frequently, the surface of 
a biochip comprises a plurality of addressable locations, each of which has the capture 
reagent bound there. A biochip can be adapted to engage a probe interface and, 
hence, function as a probe in gas phase ion spectrometry, preferably mass 
spectrometry. Alternatively, a biochip of the invention can be mounted onto another 
substrate to form a probe that can be inserted into the spectrometer. 
[0076] A variety of biochips is available for the capture of biomarkers, in 
accordance with the present invention, from commercial sources such as Ciphergen 
Biosystems (Fremont, CA), Perkin Elmer (Packard BioScience Company, Meriden, 
CT), Zyomyx (Hayward, CA), and Phylos (Lexington, MA). Exemplary of these 
biochips are those described in U.S. patents No. 6,225,047, supra, and No. 6,329,209 
(Wagner et al.), and in PCT publications WO 99/51773 (Kuimelis and Wagner) and 
WO 00/56934 (Englert et al.). 

[0077] More specifically, biochips produced by Ciphergen Biosystems have 
surfaces, presented on an aluminum substrate in strip form, to which are attached, at 
addressable locations, chromatographic or biospecific adsorbents. The surface of the 
strip is coated with silicon dioxide. 

[0078] Illustrative of Ciphergen ProteinChip® arrays are biochips H4, SAX-2, 
WCX-2, and IMAC-3, which include a functionalized, cross-linked polymer in the 
form of a hydrogel, physically attached to the surface of the biochip or covalently 
attached through a silane to the surface of the biochip. The H4 biochip has isopropyl 
functionalities for hydrophobic binding. The SAX-2 biochip has quaternary 
ammonium functionalities for anion exchange. The WCX-2 biochip has carboxylate 
functionalities for cation exchange. The IMAC-3 biochip has nitrilotriacetic acid 
functionalities that adsorb transition metal ions, such as Cu++ and Ni++, by chelation. 
These immobilized metal ions, in turn, allow for adsorption of biomarkers by 
coordinate covalent bonding. Thus, Ciphergen's IMAC ProteinChip® arrays are sold 
with reactive moieties that become adsorbent upon the addition by the user of a metal 
solution. 

[0079] In keeping with the above-described principles, a substrate with an adsorbent 
is contacted with the sample, containing serum, for a period of time sufficient to allow 
biomarker that may be present to bind to the adsorbent. In one embodiment of the 
invention, more than one type of substrate with adsorbent thereon is contacted with 
the biological sample. For example, a sample may be applied to both a WCX and an 
IMAC chip. This technique can allow for even more definitive assessment of cancer 
status. After the incubation period, the substrate is washed to remove unbound 
material. Any suitable washing solutions can be used; preferably, aqueous solutions 
are employed. 

[0080] An energy absorbing molecule then is applied to the substrate with the bound 
biomarkers. As noted, an energy absorbing molecule is a molecule that absorbs 
energy from an energy source such as a laser, thereby assisting in desorption of 
biomarkers from the substrate. Exemplary energy absorbing molecules include, as 
noted above, cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid. 
Preferably sinapinic acid is used. 

[0081] The biomarkers bound to the substrates are detected in a gas phase ion 
spectrometer such as a time-of-flight mass spectrometer. The biomarkers are ionized 
by an ionization source such as a laser, the generated ions are collected by an ion 
optic assembly, and then a mass analyzer disperses and analyzes the passing ions. 
The detector then translates information of the detected ions into mass-to-charge 
ratios. Detection of a biomarker typically will involve detection of signal intensity. 
Thus, both the quantity and mass of the biomarker can be determined. 

WO 2004/061410    PCT/US2003/037090 

[0082] Data generated by desorption and detection of biomarkers can be analyzed 
with the use of a programmable digital computer. The computer program analyzes 
the data to indicate the number of markers detected, and optionally the strength of the 
signal and the determined molecular mass for each biomarker detected. Data analysis 
can include steps of determining signal strength of a biomarker and removing data 
deviating from a predetermined statistical distribution. For example, the observed 
peaks can be normalized, by calculating the height of each peak relative to some 
reference. The reference can be background noise generated by the instrument and 
chemicals such as the energy absorbing molecule, which is set as zero in the scale. 
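
As a non-limiting illustration, normalization of peak heights against a noise reference set to zero can be sketched in Python as follows. The function name, the single scalar noise level, and the clipping of negative values to zero are assumptions of this sketch rather than features recited in the description.

```python
def normalize_peaks(peak_heights, noise_level):
    """Express each peak height relative to a noise reference set to zero.

    peak_heights: raw peak heights in arbitrary intensity units.
    noise_level: estimated background from the instrument and matrix
                 (hypothetical scalar; how it is estimated is not specified).
    """
    # Shift so the noise reference maps to zero; clip anything below it
    return [max(height - noise_level, 0.0) for height in peak_heights]


raw_heights = [5.0, 12.5, 2.0]  # made-up example values
normalized = normalize_peaks(raw_heights, noise_level=2.0)
```

Other references, such as a calibrant peak or total ion current, could be substituted by changing what is subtracted or divided out.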
[0083] The computer can transform the resulting data into various formats for 
display. The standard spectrum can be displayed, but in one useful format only the 
peak height and mass information are retained from the spectrum view, yielding a 
cleaner image and enabling biomarkers with nearly identical molecular weights to be 
more easily seen. In another useful format, two or more spectra are compared, 
conveniently highlighting unique biomarkers and biomarkers that are up- or down- 
regulated between samples. Using any of these formats, one can readily determine 
whether a particular biomarker is present in a sample. 

[0084] Software used to analyze the data can include code that applies an algorithm 
to the analysis of the signal to determine whether the signal represents a peak in a 
signal that corresponds to a biomarker according to the present invention. The 
software also can subject the data regarding observed biomarker peaks to 
classification tree or ANN analysis, to determine whether a biomarker peak or 
combination of biomarker peaks is present that indicates lung cancer status. Analysis 
of the data may be "keyed" to a variety of parameters that are obtained either directly 
or indirectly from the mass spectrometric analysis of the sample. These parameters 
include, but are not limited to, the presence or absence of one or more peaks, the 
height of one or more peaks, the log of the height of one or more peaks, and other 
arithmetic manipulations of peak height data. 
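
The parameters to which the analysis may be keyed can be illustrated by a small Python sketch that derives presence or absence, height, and log height from a set of peak heights. The function name, the dictionary layout, the marker labels, and the presence threshold are hypothetical choices for this example only.

```python
import math


def peak_features(peak_heights, presence_threshold=0.0):
    """Derive presence, height, and natural-log height for each marker.

    peak_heights: mapping of marker label to measured peak height.
    presence_threshold: height above which a peak counts as present
                        (an assumed cutoff, not taken from the source).
    """
    features = {}
    for marker, height in peak_heights.items():
        features[marker + "_present"] = height > presence_threshold
        features[marker + "_height"] = height
        # The log is undefined for non-positive heights, so record None
        features[marker + "_log_height"] = math.log(height) if height > 0 else None
    return features


example = peak_features({"M1": 1.0, "M2": 0.0})  # made-up values
```

Features of this kind could then be supplied to a classification tree or neural network.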

[0085] In another aspect, the present invention provides kits for aiding in the 
diagnosis of lung cancer status, which kits are used to detect biomarkers according to 
the invention. The kits screen for the presence of biomarkers and combinations of 
biomarkers that are differentially present in samples from normal subjects and 
subjects with lung cancer. 

[0086] In one embodiment, the kit comprises a substrate having an adsorbent 
thereon, wherein the adsorbent is suitable for binding a biomarker according to the 
invention, and a washing solution or instructions for making a washing solution, in 
which the combination of the adsorbent and the washing solution allows detection of 
the biomarker using gas phase ion spectrometry, e.g., mass spectrometry. The kit may 
include more than one type of adsorbent, each present on a different substrate. 
[0087] In another embodiment, a kit of the invention may include a first substrate, 
comprising an adsorbent thereon, and a second substrate onto which the first substrate 
is positioned to form a probe, which can be inserted into a gas phase ion spectrometer, 
e.g., a mass spectrometer. In another embodiment, an inventive kit may comprise a 
single substrate that can be inserted into the spectrometer. 

[0088] In a further embodiment, such a kit can comprise instructions for suitable 
operational parameters in the form of a label or separate insert. For example, the 
instructions may inform a consumer how to collect the sample or how to wash the 
probe. In yet another embodiment the kit can comprise one or more containers with 
biomarker samples, to be used as standard(s) for calibration. 

[0089] In a preferred embodiment, the detection of biomarkers for diagnosis of lung 
cancer in a subject entails contacting a sample from a subject or patient, preferably a 
serum sample, with a substrate having an adsorbent thereon under conditions that 
allow binding between the biomarker and the adsorbent, and then detecting the 
biomarker bound to the adsorbent by gas phase ion spectrometry, preferably by 
Surface Enhanced Laser Desorption/Ionization (SELDI) mass spectrometry. The 
biomarkers are ionized by an ionization source such as a laser. The generated ions are 
collected by an ion optic assembly and accelerated toward an ion detector. Ions that 
strike the detector generate an electric potential that is digitized by a high speed 
time-array recording device that digitally captures the analog signal. Ciphergen's 
ProteinChip® system employs an analog-to-digital converter (ADC) to accomplish 
this. The ADC integrates detector output at regularly spaced time intervals into 
time-dependent bins. The time intervals typically are one to four nanoseconds long. 
Furthermore, the time-of-flight spectrum ultimately analyzed typically does not 
represent the signal from a single pulse of ionizing energy against a sample, but rather 
the sum of signals from a number of pulses. This reduces noise and increases 
dynamic range. This time-of-flight data is then subjected to data processing. In 
Ciphergen's ProteinChip® software, data processing typically includes TOF-to-M/Z 
transformation, baseline subtraction, and high frequency noise filtering. Thus, both the 
quantity and mass of the biomarker can be determined. 
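
The processing steps named above (summation over pulses, TOF-to-M/Z transformation, baseline subtraction, and noise filtering) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the algorithm of the ProteinChip® software: the quadratic calibration assumes an idealized linear time-of-flight instrument in which m/z is proportional to the square of the flight time, the calibration constants `a` and `t0` are hypothetical values that would be fitted from calibrants, and the sliding-minimum baseline and moving-average filter are stand-ins chosen for simplicity.

```python
def sum_shots(shots):
    """Sum the binned detector signal from several laser pulses, bin by bin."""
    return [sum(values) for values in zip(*shots)]


def tof_to_mz(time_ns, a, t0):
    """Idealized quadratic calibration: m/z = a * (t - t0) ** 2."""
    return a * (time_ns - t0) ** 2


def subtract_baseline(signal, window=5):
    """Subtract a baseline estimated as the local minimum in a sliding window."""
    half = window // 2
    corrected = []
    for i, value in enumerate(signal):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        corrected.append(value - min(signal[lo:hi]))
    return corrected


def smooth(signal, window=3):
    """Moving-average filter to damp high frequency noise."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed
```

In use, the summed spectrum would be baseline-corrected and smoothed, and each time bin would be mapped to m/z with `tof_to_mz` before peak detection.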

[0090] The detection of the biomarkers can be enhanced by using certain selectivity 
conditions, e.g., adsorbents or washing solutions. In one embodiment, the same or 
similar selectivity conditions that were used to discover the biomarkers are used in the 
method of detecting the biomarker in the sample. For example, immobilized metal 
affinity capture chips such as the Cu(II) IMAC3 and weak cationic exchange chips 
such as the WCX2 chips are preferred as the adsorbents for biomarker detection. 
However, other adsorbents can be used, as long as they have the binding 
characteristics suitable for binding the biomarkers. 

[0091] More particularly, armed with the information regarding the biomarkers 
identified herein, various methods can be used to recognize patterns of doublets, 
triplets, and higher combinations of biomarkers according to the invention. These 
methods take raw data regarding which peaks are present and their intensity and 
provide a differential diagnosis of lung cancer versus normal for a sample. 
[0092] Thus, the process can be divided into the learning phase and the 
classification phase. In the learning phase, a learning algorithm is applied to a data 
set that includes members of the different classes that are meant to be classified, for 
example, data from a plurality of samples diagnosed as cancer and data from a 
plurality of samples assigned a negative diagnosis. The methods used to analyze the 
data include, but are not limited to, artificial neural networks, support vector machines, 
genetic algorithms, self-organizing maps, and classification and regression tree 
analysis. These methods are described, for example, in WO 01/31579, May 3, 2001 
(Barnhill et al.); WO 02/06829, January 24, 2002 (Hitt et al.) and WO 02/42733, May 
30, 2002 (Paulse et al.). The learning algorithm produces a classifying algorithm. 
The classifier is keyed to elements of the data, such as particular markers and 
particular intensities of markers, usually in combination, that can classify an unknown 
sample into one of the two classes. The classifier is ultimately used for diagnostic 
testing. 
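
The division into a learning phase and a classification phase can be illustrated with a deliberately simple Python sketch: a single-node decision rule (a "stump") learned by exhaustive search over markers and intensity thresholds. This is not the classification and regression tree software used to build the trees of Tables 1 and 2; the function names, the label strings, and the training values are all hypothetical.

```python
def learn_stump(samples, labels, positive="cancer"):
    """Learning phase: find the marker, threshold, and direction with fewest errors.

    samples: list of dicts mapping marker label to peak intensity.
    labels: list of class labels, one per sample.
    """
    best = None
    for marker in samples[0].keys():
        values = sorted({s[marker] for s in samples})
        # Candidate thresholds lie halfway between consecutive observed values
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
        for threshold in thresholds:
            for high_is_positive in (True, False):
                errors = 0
                for sample, label in zip(samples, labels):
                    predicted = (sample[marker] > threshold) == high_is_positive
                    if predicted != (label == positive):
                        errors += 1
                if best is None or errors < best[0]:
                    best = (errors, marker, threshold, high_is_positive)
    if best is None:
        raise ValueError("no marker varies across the training samples")
    return best[1], best[2], best[3]


def classify(sample, stump, positive="cancer", negative="normal"):
    """Classification phase: apply the learned rule to an unknown sample."""
    marker, threshold, high_is_positive = stump
    is_positive = (sample[marker] > threshold) == high_is_positive
    return positive if is_positive else negative


# Made-up training data with two hypothetical markers "A" and "B"
training = [{"A": 1.0, "B": 5.0}, {"A": 2.0, "B": 1.0},
            {"A": 8.0, "B": 4.0}, {"A": 9.0, "B": 2.0}]
training_labels = ["normal", "normal", "cancer", "cancer"]
stump = learn_stump(training, training_labels)
```

A multi-node tree would repeat the same search recursively on the subsets produced by each split.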

[0093] Software, both freeware and proprietary software, is readily available to 
analyze such patterns in data, and to devise additional patterns with any 
predetermined criteria for success. Those biomarkers which by themselves are 
predictive of a differential diagnosis of lung cancer versus normal do not require 
pattern recognition software to analyze the data. 

[0094] The following examples are offered by way of illustration, and are not 
limiting. 

Example 1. Fractionation of serum 

Buffers: 

1. U9 (9M urea, 2% CHAPS, 50mM Tris-HCl pH9) 
2. U1 (1M urea, 0.22% CHAPS, 50mM Tris-HCl pH9) 
3. wash buffer 1: 50mM Tris-HCl with 0.1% n-octyl β-D-glucopyranoside (OGP) pH9 
4. wash buffer 2: 100mM sodium phosphate with 0.1% OGP pH7 
5. wash buffer 3: 100mM sodium acetate with 0.1% OGP pH5 
6. wash buffer 4: 100mM sodium acetate with 0.1% OGP pH4 
7. wash buffer 5: 50mM sodium citrate with 0.1% OGP pH3 
8. wash buffer 6: 33.3% isopropanol / 16.7% acetonitrile / 0.1% trifluoroacetic acid in water. 

[0095] Thirty microliters of U9 buffer were added to 20 µL of serum in a tube and 
were mixed at 4°C for 20 minutes. Ion exchange resin (Q Ceramic HyperD F ion 
exchange resin, Biosepra SA, France) was washed 3 times with 5 bed volumes of 
50mM Tris-HCl pH9 and stored in 50% suspension. To each well of a 96-well filter 
plate (96-well Silent Screen filter plate, Loprodyne membrane, 0.45 micron pore, 
Nalge Nunc International, USA), 125 µL of ion exchange resin (50% suspension) was 
added on a Biomek 2000 Automation Workstation (Beckman Coulter, Fullerton, CA), 
washed 3 times with 150 µL U1 buffer, and vacuum dried. Urea-treated serum was 
transferred to each well of ion exchange resin. The serum tube was rinsed with 50 µL 
of U1 buffer, which was also transferred to the corresponding well in filter plate. The 
filter plate was mixed on a platform shaker at 4°C for 30 minutes. Flow-through 
fraction was collected in a 96-well plate by vacuum suction (Fraction 1). Then, 
100 µL of wash buffer 1 was added to each well of filter plate and mixed for 10 
minutes at room temperature. Eluant was collected into the same 96-well plate 
(Fraction 1). Resins in the filter plate were subsequently washed two times each with 
100 µL wash buffers 2, 3, 4, 5 and 6. Each eluant (total volume of 200 µL) was 
collected in a 96-well plate (Fractions 2, 3, 4, 5 and 6). 

Example 2. SELDI analysis of fractionated serum 

[0096] ProteinChip® Arrays were set up in 96-well bioprocessors. Buffer delivery 
and sample incubation were performed on a Biomek 2000 Automation Workstation. 
Each serum fraction was analyzed on IMAC3 (loaded with copper) and WCX2 
ProteinChip® Arrays in duplicate. IMAC3 copper and WCX2 arrays (Ciphergen 
Biosystems Inc, Fremont, CA) were equilibrated two times with 150 µL of binding 
buffer (100mM sodium phosphate + 0.5M NaCl pH7 for IMAC3, 100mM sodium 
acetate pH4 for WCX2). Each serum fraction was diluted in the corresponding 
binding buffer (1/5 dilution for IMAC3 and 1/10 dilution for WCX2) and 100 µL was 
applied to each ProteinChip® array. Incubation was performed on a platform shaker 
at room temperature for 30 minutes. Each array was washed three times with 150 µL 
of corresponding binding buffer and rinsed two times with water. ProteinChip® 
arrays were air-dried. Sinapinic acid matrix (prepared in 50% acetonitrile, 0.5% 
trifluoroacetic acid) was applied to each array. ProteinChip® arrays were read on a 
ProteinChip® PBSII Reader (Ciphergen Biosystems Inc.). A total of 253 laser shots 
were averaged for each array. 
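
The dilution arithmetic of this example can be worked through with a short Python sketch. It assumes that a "1/5 dilution" means one part serum fraction in five parts total volume, and that the 100 µL applied to each array is the diluted volume; neither reading is stated explicitly in the text, and the function name is hypothetical.

```python
def dilution_volumes(final_volume_ul, dilution_factor):
    """Split a final volume into (sample, buffer) for a 1-in-N dilution."""
    sample_ul = final_volume_ul / dilution_factor
    buffer_ul = final_volume_ul - sample_ul
    return sample_ul, buffer_ul


# Under the stated assumptions, per 100 µL applied to an array:
imac3_volumes = dilution_volumes(100, 5)   # 1/5 dilution for IMAC3
wcx2_volumes = dilution_volumes(100, 10)   # 1/10 dilution for WCX2
```

Because each fraction was analyzed in duplicate on both array types, the fraction volume consumed per sample would be twice the sum of the two sample volumes under these assumptions.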

[0097] All publications and patent documents cited in this application are 
incorporated by reference in their entirety for all purposes to the same extent as if 
each individual publication or patent document were so individually denoted. By their 
citation of various references in this document Applicants do not admit that any 
particular reference is "prior art" to their invention. 



